-
The wide array of currently available genomes displays a wonderful diversity in size, composition, and structure and is quickly expanding thanks to several global biodiversity genomics initiatives. However, sequencing of genomes, even with the latest technologies, can still be challenging for both technical reasons (e.g., small physical size, contaminated samples, or limited access to appropriate sequencing platforms) and biological reasons (e.g., germline-restricted DNA, variable ploidy levels, sex chromosomes, or very large genomes). In recent years, k-mer-based techniques have become popular to overcome some of these challenges. They are based on the simple process of dividing the analyzed sequences (e.g., raw reads or genomes) into a set of subsequences of length k, called k-mers, and then analyzing the frequency or sequences of those k-mers. Analyses based on k-mers allow for a rapid and intuitive assessment of complex sequencing data sets. Here, we provide a comprehensive review of the theoretical properties and practical applications of k-mers in biodiversity genomics, with a special focus on genome modeling.
Free, publicly accessible full text available January 31, 2026.
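The k-mer decomposition described in the abstract above can be illustrated with a minimal Python sketch. It is not taken from the reviewed article: the function names `count_kmers` and `kmer_spectrum` and the toy reads are hypothetical, and the sketch counts k-mers on the forward strand only, whereas practical tools typically also collapse each k-mer with its reverse complement.

```python
from collections import Counter


def count_kmers(sequence, k):
    """Count every overlapping substring of length k in one sequence."""
    if k <= 0:
        raise ValueError("k must be a positive integer")
    seq = sequence.upper()
    # A sequence of length n yields n - k + 1 overlapping k-mers
    # (none at all if the sequence is shorter than k).
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def kmer_spectrum(counts):
    """Map each observed frequency to the number of distinct k-mers seen that often."""
    return Counter(counts.values())


# Toy reads for illustration only (hypothetical, not real sequencing data).
reads = ["ACGTACGT", "CGTACGTA"]

total = Counter()
for read in reads:
    total.update(count_kmers(read, 3))

spectrum = kmer_spectrum(total)
print(dict(total))
print(dict(spectrum))
```

The second function builds the frequency histogram (the "k-mer spectrum") that frequency-based analyses start from; on real data the choice of k and the handling of sequencing errors matter far more than in this toy case.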
-
Pan-genomics and genome-editing technologies are revolutionizing breeding of global crops [1,2]. A transformative opportunity lies in exchanging genotype-to-phenotype knowledge between major crops (that is, those cultivated globally) and indigenous crops (that is, those locally cultivated within a circumscribed area) [3–5] to enhance our food system. However, species-specific genetic variants and their interactions with desirable natural or engineered mutations pose barriers to achieving predictable phenotypic effects, even between related crops [6,7]. Here, by establishing a pan-genome of the crop-rich genus Solanum [8] and integrating functional genomics and pan-genetics, we show that gene duplication and subsequent paralogue diversification are major obstacles to genotype-to-phenotype predictability. Despite broad conservation of gene macrosynteny among chromosome-scale references for 22 species, including 13 indigenous crops, thousands of gene duplications, particularly within key domestication gene families, exhibited dynamic trajectories in sequence, expression and function. By augmenting our pan-genome with African eggplant cultivars [9] and applying quantitative genetics and genome editing, we dissected an intricate history of paralogue evolution affecting fruit size. The loss of a redundant paralogue of the classical fruit size regulator CLAVATA3 (CLV3) [10,11] was compensated by a lineage-specific tandem duplication. Subsequent pseudogenization of the derived copy, followed by a large cultivar-specific deletion, created a single fused CLV3 allele that modulates fruit organ number alongside an enzymatic gene controlling the same trait. Our findings demonstrate that paralogue diversifications over short timescales are underexplored contingencies in trait evolvability. Exposing and navigating these contingencies is crucial for translating genotype-to-phenotype relationships across species.
Free, publicly accessible full text available April 3, 2026.
-
The combination of ultra-long (UL) Oxford Nanopore Technologies (ONT) sequencing reads with long, accurate Pacific Biosciences (PacBio) High Fidelity (HiFi) reads has enabled the completion of a human genome and spurred similar efforts to complete the genomes of many other species. However, this approach for complete, “telomere-to-telomere” genome assembly relies on multiple sequencing platforms, limiting its accessibility. ONT “Duplex” sequencing reads, where both strands of the DNA are read to improve quality, promise high per-base accuracy. To evaluate this new data type, we generated ONT Duplex data for three widely studied genomes: human HG002, Solanum lycopersicum Heinz 1706 (tomato), and Zea mays B73 (maize). For the diploid, heterozygous HG002 genome, we also used “Pore-C” chromatin contact mapping to completely phase the haplotypes. We found the accuracy of Duplex data to be similar to that of HiFi sequencing, but with read lengths tens of kilobases longer, and the Pore-C data to be compatible with existing diploid assembly algorithms. This combination of read length and accuracy enables the construction of a high-quality initial assembly, which can then be further resolved using the UL reads, and finally phased into chromosome-scale haplotypes with Pore-C. The resulting assemblies have a base accuracy exceeding 99.999% (Q50) and near-perfect continuity, with most chromosomes assembled as single contigs. We conclude that ONT sequencing is a viable alternative to HiFi sequencing for de novo genome assembly, and provides a multirun single-instrument solution for the reconstruction of complete genomes.
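The abstract above equates a base accuracy of 99.999% with Q50. That correspondence follows from the standard Phred scale, Q = -10 · log10(error rate). The short Python sketch below shows the conversion in both directions; the function names are hypothetical and are not part of any tool used in the study.

```python
import math


def accuracy_to_phred(accuracy):
    """Convert a per-base accuracy (e.g. 0.99999) to a Phred-scaled quality value."""
    error = 1.0 - accuracy
    if not 0.0 < error <= 1.0:
        raise ValueError("accuracy must be at least 0 and strictly below 1")
    return -10.0 * math.log10(error)


def phred_to_accuracy(q):
    """Convert a Phred-scaled quality value back to a per-base accuracy."""
    return 1.0 - 10.0 ** (-q / 10.0)


# 99.999% accuracy is one expected error per 100,000 bases, i.e. about Q50.
print(round(accuracy_to_phred(0.99999), 3))
print(phred_to_accuracy(50))
```

Each step of 10 on the Q scale corresponds to a tenfold drop in the expected error rate, so Q40 would be one error per 10,000 bases and Q50 one per 100,000.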
-
An enduring question in evolutionary biology concerns the degree to which episodes of convergent trait evolution depend on the same genetic programs, particularly over long timescales. In this work, we genetically dissected repeated origins and losses of prickles (sharp epidermal projections) that convergently evolved in numerous plant lineages. Mutations in a cytokinin hormone biosynthetic gene caused at least 16 independent losses of prickles in eggplants and wild relatives in the genus Solanum. Homologs underlie prickle formation across angiosperms that collectively diverged more than 150 million years ago, including rice and roses. By developing new Solanum genetic systems, we leveraged this discovery to eliminate prickles in a wild species and an indigenously foraged berry. Our findings implicate a shared hormone activation genetic program underlying evolutionarily widespread and recurrent instances of plant morphological innovation.
-
An enduring question in evolutionary biology concerns the degree to which episodes of convergent trait evolution depend on the same genetic programs, particularly over long timescales. Here we genetically dissected repeated origins and losses of prickles, sharp epidermal projections, that convergently evolved in numerous plant lineages. Mutations in a cytokinin hormone biosynthetic gene caused at least 16 independent losses of prickles in eggplants and wild relatives in the genus Solanum. Strikingly, homologs promote prickle formation across angiosperms that collectively diverged over 150 million years ago. By developing new Solanum genetic systems, we leveraged this discovery to eliminate prickles in a wild species and an indigenously foraged berry. Our findings implicate a shared hormone-activation genetic program underlying evolutionarily widespread and recurrent instances of plant morphological innovation.
